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j^ Abstract 

O We propose to identify process zones in heterogeneous materials by tailored statistical tools. 

^ The process zone is redefined as the part of the structure where the random process cannot be 

O correctly approximated in a low-dimensional deterministic space. Such a low-dimensional space is 

' jy>j obtained by a spectral analysis performed on pre-computed solution samples. A greedy algorithm 

^-^ is proposed to identify both process zone and low-dimensional representative subspace for the so- 

f^ lution in the complementary region. In addition to the novelty of the tools proposed in this paper 

Oh for the analysis of localised phenomena, we show that the reduced space generated by the method 
is a valid basis for the construction of a reduced order model. 

m 
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(N 1 Introduction 

en 

f*"*^ In order to predict complex physical phenomena, one has to devise models which describe these phe- 

^sj nomena at the appropriate, sufficiently fine scale. In the case of fracture in heterogeneous structures, 

'""' it is usually observed that models need to be established at the scale of these heterogeneities. For com- 

^ posite laminates for instance, predicting fracture requires an explicit description of the damage process 

within each individual ply, and of the thin interface transition zone between two adjacent plies [Tl[31|3]. 

^ . Similarly, fracture models for concrete need to take into account explicitly the interface transition zone 

J3 between cement and aggregates [H 13 [6] . Models that are built on this observation describe fracture 

at a meso-scale (i.e.: scale of the mesoconstituents). With today's ever increasing computing power 

available to engineering, the simulation of large engineering components using such mesoscale models 

is within reach, albeit at a considerable cost. However, in engineering design processes, a prohibitively 

high number of such simulations may be necessary. One might be interested in the sensibility of the 

fracture process to some design parameters, or in the effect of observed variabilities on the strength of 
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the structure. Consequently, constructing reduced order models based on a mesoscale representation 
of fracture is an issue of tremendous importance in today's computational mechanics. 

Two families of approaches address this problematic from a different angle. Homogenisation-based 
reduced order modelling proposes to use the fact that the solution to multiscale problems can be split 
into two additive contributions: a macroscale contribution, which is smooth and varies slowly in the 
structure, and its microscale counterpart which varies with the fine-scale structural heterogeneities. 
Under the hypothesis of periodicity or ergodicity of the heterogeneities, one can establish a formal 
separation of scale [3 El : the macroscale part of the solution is obtained by solving a macroscale prob- 
lem at the scale of the structure, while the macroscale problem itself is obtained by solving microscale 
problems on small subdomains of the structure over which the effect of the periodic heterogeneities 
can be averaged. In the case of fracture in heterogeneous media, the difficulty is that the assumptions 
of periodicity or ergodicity are violated in the damaged region, which jeopardises the accuracy of this 
class of methods. Filtering out the effect of the crack to retrieve this periodicity is currently an active 
area of research 9, 10, TU [T2 [T3]. Algebra-based reduced order models follow a different approach. 
The idea is to define a subspace of low dimension in which the fine-scale solution to the problem of 
interest is well-approximated. The fine-scale problem is then projected onto this subspace, which yields 
a problem of small dimension, which takes into account the fine-scale features of the problem. One of 
the major difficulties here is of course the identification this subspace. 

Algebra-based reduced order modelling has been extensively studied in the literature in the case of 
linear problems. The extraction of invariant representative subspaces based on the spectral analysis 
of linear problem is the basis for a large class of methods, which comprises modal synthesis, balanced 
truncation and moment matching. The construction of reduced order model based on such eigenanal- 
ysis is relatively well-established (we refer to the review in [H]). In the case of nonlinear problems, 
some breakthrough has been made over the last decade, with the development of reduced order mod- 
elling techniques based on the proper orthogonal decomposition \TE\ [TCI [IH [THl ^^ [2D] , on the reduced 
basis method [2T], or on the a priori hyperreduction method ^2, ,2^. Generally speaking, the idea 
is to extend the concept of extraction of spectral invariants of the problem by considering a subset of 
particular solutions to the fine-scale nonlinear problem, called "snapshot". This snapshot, in the case 
of parametric analysis, is a set of solutions corresponding to particular values of the parameters. In 
the case of time-dependent nonlinear problems, the fine-scale solutions to the first time-steps can be 
used to identify a representative subspace, which in turn serves for the construction of a reduced order 
model for the cheap solution in subsequent time steps. 

However, most of the successful applications of algebra-based nonlinear reduced order modelling 
have been dedicated to mildly nonlinear problems, or more precisely to problems that exhibit a mild 
nonlinear dependency of the fine-scale solution on time or parameters. Fracture mechanics is charac- 
terised by strong nonlinearities in the region where cracks initiate and propagate, which jeopardises 
the identification of representative low-dimensional subspaces (this fact will be discussed in more de- 
tails in the core of the paper). Attempts to exclude process zones (i.e.: the zones where mechanical 
energy is dissipated by damage mechanisms) from the reduction process have been proposed, allowing 
for the construction a reduced order model far away from the source of nonlinearity, while solving 
in the process zone without approximation [531 [21] (a schematic of this general idea applied to the 
case of fracture of random composites is given in figure [I]) . Similar strategies have been developed 
in other contexts where reduced order modelling cannot be efficiently applied in a particular region 
(usually the region of interest) of the structure, using for instance substructuring [551 [HI [13 [55] or 
local enrichment [TT] [52] . So far, the identification of these process zones has been made a priori, from 
physical or empirical observations. The present work is an attempt to provide some objectivity in the 
determination of the zones where algebra-based coarsening should not be performed. 

The method proposed in this paper relies on a very simple idea. The process zone can simply be 
redefined as the part of the domain where reduced order modelling does not provide a satisfactory 
level of accuracy. Let us elaborate on the implications of this observation in the context of fracture of 
random materials. The problem of interest is depicted in figure [l] The distribution of inclusions in a 
particulate composite follows a given probability law. We are interested in finding a low-dimensional 
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Figure 1: Principle of the reduced order modelling for problems with localised random effects while 
excluding regions of the domain where local lack of correlation is observed. The proposed work aims 
at providing an objective methodology to identify these regions and construct a representative reduced 
space for the solution corresponding to the complementary "smooth" domain. 



subspace in which the solutions corresponding to all possible realisations of the particle distribution 
are well approximated. We use the classical idea of the snapshot proper orthogonal decomposition 130] 
to obtain such a subspace. One first computes explicitly the solution corresponding to a few particular 
realisations (the snapshot) of the random particle distribution. The reduced space is then defined as 
the subspace of the space spanned by the snapshots in which the projection of the snapshots is the 
closest from the original sample. Mathematically, the reduced space is obtained by a spectral analysis 
of the space spanned by the sample solutions. The problem is said to be reducible if such a space of 
low dimension indeed allows for a sufficiently small error of projection. Fracture of random materials, 
as will be shown in this work, is not directly reducible in this sense. However, if one excludes the 
process zone from the spectral analysis, an acceptable level of reducibility is obtained in the remainder 
of the domain. Therefore, we can use the observation made previously and define the process zone as 
a region where the error of projection is too large. We keep using the term "process zone" classically 
used in fracture mechanics because the region associated to large local errors of projection is observed 
to correspond to zones where damage might occur in a statistical sense. But we emphasize the fact 
that the usual definition of the "process zone" is not strictly equivalent to the algorithmic redefinition 
used in this paper. To summarise the idea introduced in this paragraph, we look for a reduced space 
and, at the same time, for the associated domain in which the approximation provided by reduced 
order modelling is valid. 

Mathematically, this problem can be formulated as a problem of minimisation of the error of 
projection in the reduced space, the unknowns being the reduced space itself and the process zone. 
We propose to solve it by a greedy-type algorithm. One first perform a classical proper orthogonal 



decomposition of the snapshot where a fixed iterate of the process zone is excluded. In a second step, 
one looks for a small update of the process zone which minimises the error of projection, given the 
reduced space. We arrive at a sub-optimal solution of what we call the restricted POD. 

Using the proposed tool, we find that the approximation error associated with a low-dimensional 
reduced space and a relatively confined process zone is acceptable for engineering applications. In 
addition, the dimensionality of the solution space, far away from the process zone, is relatively well- 
defined: one can easily identify the (small) dimension of the reduced space which contains meaningful 
information. However, the process zone itself is not well-defined. Indeed, the influence of the random 
cracks is felt far away from the physical discontinuities. As a consequence, the error associated with 
the restricted POD decreases at an almost constant rate with the size of the process zone. We conclude 
that there is not one definite process zone, but several valid process zones, each defined by the level of 
error of the reduced order model built in the complementary region. 

The outline of the paper is the following. We describe in Section 2 the damageable lattice model that 
is used to simulate fracture in particulate composites (see some of the related work in A, 31, 32, 5, 33,). 
We briefly describe the random distribution of material properties, and set up the problem of fracture 
in particulate composites. The classical POD and Snapshot POD are presented in Section 3, along 
with cross-validation statistical error estimates [Ml 1311 [35] . We show numerically that the problem of 
interest is locally uncorrelated and review classical solutions to address this type of issues. In section 4, 
we define an extension of these approaches, the restricted POD. A progressive greedy-type algorithm 
meant to find an optimal decomposition associated with a spatial domain of validity is described next. 
We show that the proposed tool is relevant to the problem of fracture of random particulate composites 
in section 5, and discuss the possible improvements and applications in section 6. 

2 Lattice model of fracture in particulate composites 
2.1 Lattice structure 




Figure 2: Definition of the beam network and its associated lattice model 



We consider a two-dimensional lattice occupying a continuous domain fl with boundary dfl. Each of 
the rib lattice beams occupies domain il^^'. Sets of adjacent beams are linked by rigid joints occupying 
domains Q^^^ such that Q = Ubeli nJ ^^''^ ^ Ujeirib+i nt+n l ^^''^ ■ '^j is the number of joints. The 
structure is subjected to prescribed displacements t/^ on the part of its boundary denoted by 951„, 
to prescribed tractions F^^ on the complementary boundary dftf — dfl\dflu, and to a distributed 



body force / , over time interval T = [0,T]. These boundary conditions depend on time, which wiU 
not be written exphcitly, unless necessary. The evolution is supposed quasi-static, isothermal and we 
make the assumption of small perturbations (small displacements and strain) . Let us define a global 
reference frame TZ — (O, x,y, z), where B — {x,y,z) is an orthonormal basis and z is orthogonal to the 
two-dimensional plane. Depending on the context, we will alternatively use the notations TZ = (O, x, y) 
and B = (x, y). 

We consider the limit case where each beam is infinitely slender (in the two-dimensional plane), 
and the domain occupied by the joints is of measure null. We therefore work with a network of 
unidimensional domains ((u;('''')ftg|i „j^|, such that lo = Ubeli n H ^^^'^ ^^ embedded in M? , joining a set 
of grid points denoted by 'P = {£}ig|i.„p|. Let Py''-^ e p and pC*)'^ ^ -p \yQ the two extremities 
of beam (6). Line (p(f').i^ p(h),2) defines the neutral fibre of beam b. F^''^'^ and P^''''^ are ordered 
such that if we define global indices i and j by Pj = Es and P = P} '''^ , we enforce the condition 
i < j. We define the unit neutral axis vector of beam b by n^'^^ — (P*- ''^ — P} ''^)/L^^\ where 
pC*) — ||p( i'"^ — p( )'i||2 is the length of beam b. The definition of a local reference frame attached 
to b is done as follows: 7^('') == {E!:''^'^ ,Tf'\t^''\z) with t^^'^ = zhrfi'\ The local orthonormal basis is 
denoted by B'^^'^ = {rS^\'6'^\z). We will write the following coordinates of arbitrary point Mj- ' of (5) 
in the local reference frame: 

pW-iAfW ^£nW+ 77 #)+zz. (1) 

The orthogonal projection of M^ •* onto semi-segment (P*^ -''^,n*^''-') is denoted by M_ and is the point 
of the neutral fibre at linear coordinate ^. The cross sections (section orthogonal to the neutral fibre) 
of beam (5) at points P^ ''^ and P^ ''^ will be respectively denoted by S''-^'"^ and S''''-'^^. A cross section 
at any other point M^^\^) wiU be S^^\£,)- 

Let us also define the subset of lattice grid points connecting at least two beams J = {-Pi}ig|i,„ .| C 
V (centroids of joints (j)), the set of grid points at which Dirichlet boundary conditions are applied 
Vij — {-Pi}i(E|i.n„] C V (centroids of beam cross sections that belong to dVlu, and centroids of a joints 
whose boundaries intersect d^^) and the set of grid points at which Neumann boundary conditions 
are apphed Vf = {^i}ieii,„^i C V such that Vp^Vu^ J ^V andVp ^Vu ^ {}• 

2.1.1 Euler-Bernoulli approximation 

Let u^^-* be the unknown displacement field in beam (6), which belongs to the space ZY*^^^ of kinematically 
admissible fields defined over Vl^^': 
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{uWeiJi(f^W) I M|ao„ = [/.}' ^^) 



Let U'^ be the associated vector space. Under the assumptions made previously, the weak form of the 
balance equations reads, at any time t G [0, T]: 

find u^''' e U^''^ such that: Vu* e U'^''^'", 

q!-''^ : e(u^) dn ^ f f , ■ u* dn + f F^-u'dT+f a'-''^ ■ fS^^ ■ u* dT , ^^^ 



where a'^''^ is the Cauchy stress tensor and e(M^''') — 1/2 (V(m''''') -f V(u'^''^)'^) is the symmetric part 

of the displacement gradient, and fv''' = in^^' is the outer normal to the beam. The last term in 
equation pi accounts for the reaction forces from the adjacent beams. This balance equation needs to 
be complemented by a constitutive law between a and e (damage in our case), which will be detailed 
later on. 

Let us now define the classical beam approximation of the previous problem on fl^^'. The dis- 
placement M^''^ is searched in a subspace W''^''-' of U'^''^ which satisfies the classical Eulcr-BernouUi 



assumptions: 



U^'"' = [u^'^ e H\m) I .(^)(C,,) = {^^'"^%)'^l^^^^')^^^^ -d n,f{i) =. 0(^)(O} , (4) 



Equation Q defines a rigid body kinematic of eacli of tlie cross sections of tire beam. The displacement 
is therefore uniquely defined by three unidimensional functions f, w and defined on lo^^i. We will 
write q'^^^ — w'-^-' n^''^ + w'^''^ r-^> the displacement of a point of the neutral axis and the local solution 
S(b) = (qW^eW)^ searched in space Q^'') defined by: 



qW = {^C) = (g(''),0W) I gW e i/iC^W) ,0W e i7i(..(^)) , g[g^ = g, , 0|p„ = 0,} 



(5) 



Injecting the lattice approximation of the displacement into the balance equation ([3]), we obtain 
the following homogenised weak formulation: 



Find S^''^ e QC') such that: VS'^'')* £ Q('')'°, 



(6) 
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The generalized strain in equation ( 15 ) is: 



i^'HO 



g(6) 






(7) 



where the generalized stress vector (axial force and moment) is defined as: 



/ 7V(&) 



aW(0 = 



(0 



„W.(^W.„W)d5 \ 



, A/C-) (e) = - / Zi^'') • (aC') • n(^) ) ?? d5 , 



(8) 



the generalised distributed forces and moments and the generalised prescribed forces and moments 
are: 

/ I f AS \ ,- . ( [ F,,dndz \ 
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and the generalised reaction forces and moments read: 



F^^ drj dz 
E-d ' H. 1] dridz 



(9) 
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(10) 
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2.1.2 Rigid joint relations 

Let us consider a joint {j) of centroid Pi ^ J between a set of beams C'^-'^ — {(6), (&'), (6"), ...}. We 
assume that the joint behaves like a rigid body. Hence the displacement y^^^ belongs to space: 



U^^'^ = UeH^{n^^'>) I W|M=9^^'^+OMAz6l(j')} , 



(11) 



Writing that the displacement field must be continuous between the joint and the connected beams, 
we obtain the lattice approximation of the compatibility conditions at point Pf. 






(12) 



As it has been done in the previous subsection for a beam (6), the equilibrium of a joint (j) reads: 
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(13) 



where the expression of the internal virtual work has vanished due to the assumption of rigid body 
motion, and the volume forces are ignored because of the slenderness assumption. Therefore, only the 



terms associated to reaction forces remain. Integrating each of the terms in ( 13 1 along the contact 
planes between the joint and the connected beams, we obtain the following static lattice equilibrium 
relations: 

f 5:(F).,(^)*)i,,..^p^=o 

6GCU) Q4^ 

5:(MWe(^)'^)|pC.,.^P^=0 ' ^ ^ 

k 6GCU) 

2.1.3 Assembled lattice problem 
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Figure 3: Periodic lattice structure with hexagonal repeated pattern as defined in figure. Example of 
boundary conditions are also represented. 

Summing the local weak formulation of the balance equation of the beams and taking the rigid 
joint relations into account, we obtain the following weak form for the lattice problem: 

Find S={q,e)^Q such that: VS"^ G Q", 



where 



b=l 



(!>),• 






p(b).i^-pj 



Q^{{q_,d)^ U (9''\^^")eC°(L.)|V6eIi,nJ,(gW,0(^))eQW 

bell.nbl 



and QP is the associated vector space. 



(15) 



(16) 



2.2 Elastic-damageable constitutive law 



We first describe the case of linear elastic beams, establish the link between the theory of linear beam 
networks and continuous elastic bodies. We then extend these concepts to damageable media. 

2.2.1 Elastic properties 

Consider the linear elastic constitutive law at a point M_ of beam 51^^^: 
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(17) 



where E is the Young modulus of the material, and v is its Poisson ratio. We assume the following 
energy equivalence in a section of the beam: the virtual work of the stress field in a displacement field 
constrained by the Euler-BernouUi assumption is equal to the virtual work of the beam formulation, 
which reads 



V^W^eZY'^W, 



(T^^) : e(u(^)*) dS = a(^) • I'^^HS^''^* {^{M)) 



We obtain the classical linear state law: 

where I^^-' — Jswicim)) ^"^ '^'^ ^^ ^^^ second moment of area of section S^''H£(AI)). 

2.2.2 Link with an elastic continuum in the case of hexagonal regular lattice 
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Figure 4: Periodic lattice structure with hexagonal repeated pattern 

We now consider a periodic plane lattice structure made of a repeated hexagonal pattern as shown 
in figures [s] and |4J The length of the beams in the network L^''\ their thickness h^'''' and depth t'^''^ are 
supposed constant trough the network. By classical homogenisation of the behaviour of the hexagonal 
representative volume element with respect to a point of a 2D continuum assumed in plane stress, and 
under the assumption of zero distributed forces, it is shown that the lattice structure is energetically 
equivalent to a continuum with the following material properties: 
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(20) 



where t is the depth of the continuous structure, which is not necessarily equal to the depth of the 
beams t^''\ 

Unfortunately, this theory, as far as the authors know, has not been extended to damage mechanics. 
It suffers the limitation of classical homogenisation to non-softening behaviours. In the literature, the 
elastic constants are used as a starting point for a phenomcnological damage model, whose parameters 
then need to be fitted to experimental results. We follow the same approach. The damage model used 
in our simulations is described in the next subsection. 

2.2.3 Elastic-damage law 

The model presented here is based on classical damage mechanics [37], applied to lattice structures. 
We introduce two different damage mechanisms, one acting in traction and the other one acting in 
bending. This assumption is consistent with the model used in 5 , where it is argued that damage 
in traction corresponds to damage due to hydrostatic deformations, while damage due to bending 
corresponds to shear damage. 

We postulate the following Helmholtz free energy per unit length of beam (b) : 

^ (e, d) = \ fi?(^)5W(l - d„)^^^^^^ ^/ +i?W/W(l - d,)eA . (21) 

dn and dt are two damage variables ranging from to 1. They account for the non- reversible softening 
of the beam with increasing load, in respectively tension and bending. Compression in this model does 
not dissipate energy, which is mathematically introduced by making use of the positive part extractor 
< . >+. We introduce the compact notation d = (d„ dt)^ . 

The relationship between generalised stress and strain in beam [h] is obtained as follows: 



N\^d^^ (e^^S^'\1 rfn)^^^ 




(22) 



This state law is the nonlinear counterpart of the linear state law (19 1. 

The second state law, which links the damage variables to dual driving thermodynamic forces, 
reads: 



< V_f >, 






At last, an evolution law is defined to fully define the damage evolution: 

Y{t) = max((y„(r))" + hYtiT))")^ , (24) 



T<t 



This damage law is inspired by the model described in |lj for composite laminates. We refer to 
this work for a comprehensive interpretation of the different parameters. We will just notice that Y 
is an equivalent damage energy release rate which governs the evolution of damage with traction and 
banding. The critical value Yc is therefore the "strength" of the beam section. 

2.3 Randomly distributed material properties 

The damageable lattice model is used to derive a three-phase model for concrete. Such models consider 
three different entities: matrix (cement), inclusions (hard particles, assumed spherical) and an interface 
between these two entities (see 6^ for an evidence of the existence of such interface). Plane {0,x,y) 
is a section of the three dimensional particulate composite structure. A projection of the material 
properties onto the lattice model is performed as follows: 



• The size and position of the spherical particles is generated randomly, using the algorithm given 
below 

• Beams whose two extremities belong to either the matrix phase or a single inclusion will be 
attributed respectively the properties of the matrix or of the particles 

• A beam which has its two extremities in different phases will have material properties corre- 
sponding to the interface. 

The particle distribution is generated in a similar way to the one used in [31[51[33|. The probability of 
an arbitrary point of the plane {0,x, y), located in any of the inclusions of the structure, to belong to an 
inclusion whose section by the plane has a diameter smaller than D is given the cumulative distribution 
function Pc{D) identified in [38]. If we define the normalised diameter \>y D — j-j^ , where I?Max is 
the maximum particle diameter, this distribution function Pc{D) is given by the following equation: 

D V' ( D ' 



Pc{D) = 1.065 - 0.053 

V "Max / 

-0.012 ( — ^ I -0.0045 ( -=— I -0.0025 



D V ^ D V f D -'' ^^^> 



'lax 



If we look for a distribution of particles with a discrete set of projected diameters V — {Di}ieli.\ 



n_ 

obtain that the number rii of particles of projected diameter Di is given by: 



ordered in decreasing order such that Di — -Di+i = AD (with AZ) = "'"' a positive number), we can 




V*ell,n,,l, n,^—-^\pA ' ^ \-PA ' M (27) 



where A is the surface of the structure and Pj. is the volume fraction of inclusions in the particulate 
composite. 

To position the particles, we randomly draw the coordinates of the projected centre of each one of 
them, in turn, starting from the particles of largest projected diameter. To avoid particle collision, we 
enforce the constraint that the distance between the projected centre of any two particles must at least 
equal to 1.1 [^ + ^), where Di and Dj are the projected diameters of the two arbitrary particles. 
This constraint is enforced numerically by simply drawing an other random position if the currently 
positioned particle does not satisfy this condition. 

The material properties of the three phases are attributed as follows: 

• a beam that belongs to an inclusion cannot be damaged (yc,inc = oo in the damage model). Its 
elasticity constants are Ei^c and i^inc- 

• a beam that belongs to the matrix phase has elasticity constants -EMat and z^Mat, and damage 
parameters aMat, 7Mat, ^c^Mat, 5^o,Mat and riMat- 

• a beam that belongs to the interface phase has elasticity constants -Eintcr and I'intcr: and damage 
parameters aintcr, 7intcr, ^c, inter, ^1^, inter and rijntor- The interface is, in general terms, softer and 
weaker than the matrix. 

The main drawback of such a model is that it is based on the assumption of plane stress, which is of 
course hardly justified in the case of spherical inclusions. However, a number of numerical investigations 
[H[S] have shown that this type of model is capable of giving a relatively accurate description of fracture 
in concrete structures. In the present study, we are more interested in the numerical behaviour of the 
damage model rather than in his predictive ability. We believe that the results of this paper can be 
extended to other damage models, whether related to concrete structures or not, as long as it exhibits 
the behaviour that will be described later on (e.g.: localisation of the damage process). 
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Figure 5: Two realisations of the random distribution of the material properties. The inclusions 
are spherical, and their projection onto the lattice structure is represented in black. The grey bars 
correspond to the matrix of the particulate composite, while the light grey ones define the weak 
interface between aggregates and matrix. 



In order to describe the randomness of the model formally, we introduce the probability space 
H = (0, J-", P) for the random distribution of the material properties of the lattice network. Q is the 
ensemble of outcomes of the random generation, T is the corresponding Sigma-algebra of subsets of 6 
and P is the associated probability measure. Any random distribution is characterised by distribution 
function (26 1, a fixed maximum inclusion diameter -DMaxi a- fixed phase ratio, fixed material properties 



for each of the phases, and a given lattice network. 

Two realisations of the random distribution of material properties are illustrated in figure [5J 



2.4 Discretisation 



We now present briefly the discretisation technique of the lattice balance equations ( 15 ). The displace- 



ment of a point of the neutral fibre of beam (&) is approximated using a unique finite element, of first 
order degree for the normal displacement, and third order degree for the deflection: 
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where the matrix of shape functions is given by: 
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Hence, the local finite element approximation of the solution fields in bar b can be expressed in the 
global basis B using the transformation: 






/^(.)V)(^)\ 



(30) 



where q is decomposed in B in the form q = qxX + QyH and w ' is the following rotation matrix: 

= " [-{n('Yy (nW)^xj ' ^^^^ 

The approximated generalised strain used above reads: 



V 



ib)\ 






(32) 



In order to ensure the C" continuity of the solution fields {q,0) over w, we introduce a unique 

vector of nodal unknowns U = (9x|Pj %\p^ ^|Pi ••■ 1x\p„ %|p„ ^l^"p ) > with U e M"". 

The vector of nodal unknowns of beam b is denoted by IJ' ' and is obtained from U by the following 
extraction: 

Ijib) ^ aC'^U (33) 



The relationship between V^ -' and U^ ' is: 



yib) ^ R(b)u(fc) 



(34) 



rW = 



Q 1 Q 

2 2 2 ^^^^ 

V 1 y 



We finally obtain the expression of the local solution fields and strains as a function of the vector 
of nodal unknowns: 







(35) 



(36) 



By substitution of the finite element approximation into the balance equations ( 15 ) for solution 5 



and test vector 5* (Galerkin framework), we obtain the semi-discrete system of n„ time-dependent 
equations: 



Vt e r, findU e M"" such that: 



VU* € R"" such that U*(P„) = 0, 

(U*)^ (Fi„t [m,)rem, e) +FE,t(i)) - 0, (37) 



where the vector of internal forces, which depends on the particle distribution, is given by: 






(i>) 



BW'^-aWdC, 



(38) 
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and the vector of external forces is 

rib /. // , ■ x\ Ub (F_^ ■ x\ 

FE.t(t) = Ea'"^ / ^'''" ■\L-y\d^+Y. E -^^'^?p<- • U^ • H ■ (39) 

b=i ■^""" \md ) b=ipm.evi \ Md J 

The two integral terms are evaluated for each beam using a 3-point Gauss quadrature rule. 

2.5 Solution strategy 

For a given distribution of the material properties, the nonlinear solution strategy used to solve the 
problem over time is a classical time discretisation scheme for quasi-static, rate-independent problems, 
associated with the continuation algorithm proposed by |39| to handle the instabilities, which are 
a classical feature of fracture mechanics. This procedure consists in integrating the history of the 
irreversible process by looking for a set of consecutive solutions at some pseudo-times in a discrete 
time space T'^ = (in)raGlo,nt] (with to = and tnt = T), together with the unknown amplitude of the 
external load. 

The introduction of the time discretisation into equation ([3]) at any time t„ e T^ of the analysis, 
and for a given distribution 9 of the material properties, leads to the following fully discrete nonlinear 
vectorial equation: 

Fintffu,,) ^ ,'^)+Mt)#^=Q (40) 

Homogeneous Dirichlct boundary conditions have been assumed at this stage for the sake of concision. 
In order to close the system of equations and define the time evolution of A, an additional constraint 
is introduced. It enforces that the maximum dissipation over all quadrature points of the lattice struc- 
ture is equal to a prescribed value. The system comprising the balance equations and the additional 
constraints are solved by a Newton algorithm with line search. 

2.6 Motivation of the work by a test problem 

An example for the problem of fracture of random composites is depicted in figure [61 The structure 
considered is a 2D damageable heterogeneous rectangular lattice beam. Homogeneous Dirichlet bound- 
ary conditions are applied at the bottom left and right-hand corners of the structure. Compressive 
vertical forces are applied on the top part of its boundary, as illustrated in the figure (Im top). A crack 
initiates in the region where the generalised stress is maximum, and propagates, broadly in the vertical 
direction, to minimise the potential energy stored by the structure. The time history of the irreversible 
damage mechanisms is integrated using 50 time steps. The picture shows the solution obtained at the 
last time step of the analysis. The aggregates are represented in grey, while the colorscale in the matrix 
and interface indicate the level of damage in each of the beams comprising the lattice. 

We have used in our test cases the values Ycjiiter = j^c,Mat, which is why the crack tends to 
follow the aggregate boundaries. The elastic constants of the three phases are all equal. Therefore, 
the heterogeneity of the problem is only due to damage, and is therefore relatively localised. This 
restriction, which is not physical as the aggregates are stiffer than the matrix in real concrete structures, 
will be justified later on. 

The solution corresponding to three other particle distributions are presented below. One can see 
the crack paths differ, but that damage localises in a relatively small region compared to the size of 
the beam. Away from this so-called "process zone" , the solution of the four numerical experiments 
seem qualitatively close, which justifies the search of invariant properties for the random process. 

The idea is then to look for an approximate of solution (U(i,0)) at any time t e T'* and for any 
realisation G of the random distribution of material properties in a deterministic spatial subspace 
of small dimension n^, spanned by global basis vectors ((0i)ig|i,n^|) S (M"")"*. This so-called reduced 
space being determined, one could possibly construct a reduced order model for the fast solution of the 
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Realisation 1 
and problem 
definition 




Damage 



0.5 



Realisation 2 



Realisation 3 



Realisation 4 




Figure 6: Definition of the test problem of fracture in random particulate composites (top), and 
solutions corresponding to different realisations of this problem 
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problem with randomness. The idea is to look for (U(i, 0)) in the space spanned by the pre-determined 
global basis vectors. In doing so, the numerous spatial unknowns of the discrete lattice problem can be 
reduced to n^ amplitudes associated with the global shape functions {{(f>i)ieli,n^}), which potentially 
allows for orders of magnitude of gain in numerical efficiency. 

The scope of this article is not the construction of the reduced order model itself, for which we refer 
to the extensive literature concerning reduced order modelling in the linear and mildly nonlinear case 
(see the references in the introduction of this article). In the case of heterogeneous structures, and 
by extension in the case of nonlinear heterogeneous structures, the extraction of the relevant reduced 
space itself is not established. This first step towards the construction of a reduced model for fracture 
in random materials is the topic of the paper. 

3 Extraction of coherent structures by the proper orthogonal 
decomposition 

3.1 Proper orthogonal decomposition (POD) 

The proper orthogonal decomposition (POD) |3nillllll3 is a particular family of transforms that aim 
at extracting deterministic trends from randomly scattered data. Such transforms are powerful tools 
for the analysis of parametric problems and problems with randomness are used in numerous field of 
applications (see for instance jSO] and the review and analysis of the different variants of the POD 
proposed in |43j). In the context of the analysis of multivariate random processes (or time series), the 
POD proposes to approximate the random process of interest as a combination of simply structured 
random processes. In order to do so, the random process is expanded as a finite sum of orthogonal 
deterministic spatial vectors weighted by scalar random processes. As opposed to classical Fourier 
transforms, the deterministic functions are not defined a priori, but found a posteriori by solving an 
optimisation problem. 

We look for such an approximation of the stochastic time evolution of the nodal values of the 
displacement field in the damageable lattice structure. The POD expansion of U at order n^ is a 
random process in the form 



u : r'* X e ^ M"" 

(t,0) ^ ^^^a^(t, 



(41) 



where for all i G 11,71^], cj). G M"" is a deterministic "space" vector, while ai is the associated scalar 
random process, called "weight" . n^ < n„ is the order of the POD expansion. 

The POD proposes to look for the optimal approximation U that minimises a distance d(U, tJ) 
between the expansion and the exact random process of interest U. In order to define this distance, 
let us introduce the space S of multivariate random processes of n„-dimension random vectors defined 
on 7i and indexed in T''. A natural scalar product < . , . >s on S, and an associated norm || . \\s, are 
defined as follows: 

<x,y>s=eIy, x(t, efnt, e)]^Y. ^(^) f E ^(i)^Y(i) ) 

l|X||5 = (<X,X>5)^ , 

where P{d) is the probability for elementary event 9 £ Q to happen and X and Y are arbitrary 
elements of 5. Let us introduce the matrix notation \J 0. ai{t) — J^a(t), for all t e T'^. The columns 

of operator $ G K"uXn^ ^^.^ ^^iq deterministic space vectors ( ch ] , and a is the column vector 
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of scalar random processes (ai)^^^^ ^ j. With these compact notations, and caUing 5"* the subspace 



of S comprising aU random processes of form (41 ), the POD looks for an optimal decomposition (41 ) 
which satisfies 

U = argmin (d5(U,U*)) = argmin (||U - U*|||) , (43) 

under the constraint of orthonormality of the space deterministic vectors 

£^£ = I^ . (44) 



The solution to optimisation problem (43|44) requires the introduction of the POD operator (co- 



variance operator if the random process is centred) 

H = i?| y^mt,0)mt,0V\ ■ (45) 




A global minimiser of ( 43|44 ) is then given by the following POD expansion: 



• the scalar random processes are such that a{t,6) = ^ ]J_(t,9), for all {t,6) G T^ x Q, which 
implies that U(i, 6) = ^ j&"^U(i, 9) is the orthogonal projection of U(i, 6) onto Im(#), 

• the spacial modes ( (p ] are the eigenvectors of H associated to its largest n^ eigenvalues 

V-V»eli,n^l — 

(Ai)jg|x.„ ,| (H has Hu real and non-negative eigenvalues (Ai)ig|i „^j as H is real and symmetric). 

The approximation error (i.e. the error due to the truncation of the expansion at order n^) for a 
proper orthogonal decomposition U of U is 



d5(U,U) 



. E A.. (46) 



For the POD to be useful in practice, the error of the decomposition should decrease quickly with 
order n^. In other words, the state space of the solutions of the stochastic problem of fracture over 
V = T^ X 8 should be well-approximated in a low-dimensional manifold Im($) of M"". 

3.2 Empirical Proper orthogonal decomposition 

In practice, the POD operator cannot be computed exactly. It is replaced by the usual statistical 
estimate [30| 






(47) 



In the previous equation, O'* is a subset of O containing a relatively small number ng of realisations 
of the random distribution of material properties. The set of associated realisations of the random 
process {U(i, fJ.)} u n)er'^xe^ '^ usually called a snapshot. Let us define for simplicity V — T'^ x 6". 
The empirical spatial basis ^^ of rank n^ obtained by extracting the eigenvectors of H* associated to 
its largest n^ eigenvalues (Af)ig|i.„,| minimises the empirical POD functional 

J^(r) = -- E \Mt,0)-^'^^^mt,0)\\l, (48) 



under the constraint of orthonormality (44). Notice that the spatial modes are now random, the 



estimate H" being itself a random operator. The challenge consists in ensuring that a certain distance 
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between the image of the empirical projector ^'^ ^'^ and the image of the deterministic projector 
$ $ is smaU enough for the approximation to be of some use. 

Classically, the elements of the snapshot need to be normalised in some way to make the analysis 
easier, or to obtain results that are more consistent with the way the POD transform will be used 
later on. In our case, the elements of the snapshot of large amplitude tend to "attract" the empirical 
projector $* ^^ , which is an undesirable property. We here choose to modify functional (48 ) to avoid 
this inconvenience: 



J 



s.norm 



(r) 



1 1 

ne nt 



E 



U(t, 



|U(f, 



^s^sT mt,i 



lU(i, 



(49) 



The normalised empirical POD transform obtained in this way is a weighted POD of the original, 
non-normalised, elements of the snapshot as shown by the equivalent form 



J 



S.norm 



(*^) = — 1 V - 



|U(i,6')-U''(i,6i)| 



(50) 



where V (t, 9) E V, jj (t, 9) = ^* ^^ U(i, 9) is the orthogonal projection of an arbitrary realisation of 
the random process onto the empirical reduced space Im($'*). The empirical POD operator, modified 
by the proposed normalisation, reads 



H 



S.norm 



1 1 

ne nt 



E 



1 



mt.em 



U(t, 61) U(t, 



(51) 



Let us now define the truncation error for the normalised empirical snapshot POD of order n^ 



TS.norm 



(r) 



\ 



E A? 



(52) 



This error estimate provides information about the distance between the realisations of the random 
process and their projections in the empirical reduced space Im($*) for random distributions of the 
material properties that belong to sample space O". This property and its consequences in terms of 
predictivity will be discussed in the next subsection. 



We redefine in a similar manner the deterministic error estimate v (equation (46)) such that it 



incorporates the modifications proposed in this subsection for its empirical counterpart (normalisation) : 



J« 



'(*) 



\ 



E ^^ 



(53) 



where i^i)ieii.n^i are now the eigenvalues of H = £' — 2_] ipfjT — TJVTia — (^' ^) — (^' ^)"^ (™ '^®" 

~ \ "•* t^j-h lin(i, p)|l2 / 

creasing order), and the deterministic spatial basis $ composed of its n^ first eigenvectors minimises 



J"°™(*) = E 



1 y I 

nt ^^J|U(t, 0)111 



|U(t,6i)-U(t,i 



(54) 



teT' 



In the following, we will only work with the normalised functional, and therefore write J for J" 
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3.3 Estimation of the predictive power of the empirical POD by re-samphng 
techniques 

We now need to evaluate the predictive power of the empirical POD model for an arbitrary random 
realisation 6 (£ Q. This can be quantified by the expectation of the error of projection onto the 
empirical reduced space, which reads: 



i(*' 




u(t, e*) -*"*■' ^u(t, 



(55) 



However, for the same reason invoked in the previous subsection, J cannot be calculated, and needs to 
be estimated. The most straightforward estimator is J*. However, J* is a strongly biased estimator 
of J because ^'^ was chosen as to minimise J". More precisely, the expectation of J* is lower than 
J, and it is therefore said to be an optimistic measure of the predictivity of the empirical POD. This 
can be explained qualitatively. The POD model is "trained" or "fitted" on the set of realisations 
O^. Therefore, the projection error corresponding to these particular realisations is lower than the 
projection error of a set of arbitrary realisations O* C 8. This type of behaviour is called over-fitting 
(or error of type HI in statistics). One could then simply evaluate the statistical average of the error 
on 0* and report it as an unbiased estimate for J. This is called the holdout method, and is not 
usually favoured by statisticians as it requires to reserve realisations which cannot be used to increase 
the accuracy of the POD model. 

A widely used technique to obtain an almost unbiased estimate of J without the need of additional 
realisations is the cross-validation (see comprehensive reviews of re-sampling techniques in |34) . and 
their application to principal component analysis in [3S1[3S]). The idea is to simulate the prediction of 
errors on realisations that are independent of those used to fit the POD model (the so-called training 
set). To achieve this goal, we compute the error made successively on each of the realisations of the 
snapshot, while computing the POD with all the other realisations. We then sum up all the contribu- 
tions to obtain the cross-validation estimate of the prediction error. For each of these contributions, an 
empirical reduced basis ^''~' is computed without using the realisation whose contribution is currently 
being evaluated. This can be expressed mathematically as follows: 

1 1 ,^ 1 „ . _ 2 



J"(*" 



ng nt 



(t,e)€V' 



||U(i,( 



u(i, e) - ^'-^ (*' 



-9\T 



U(i, 



(56) 



where $^ is obtained for a particular realisation of the snapshot 6 £ Q^ 
subset POD functional 

^^ E — 

t<£T'\9ee=\{e} 



by minimisation of the 



jr'i^'^') = 



ne -1 nt 



|U(t,( 



u(i, e) - ^' " {^' 



^ij{t,e) 



(57) 



J*($*) is traditionally called PRESS (predicted residual sum of squares) in statistics. It can be shown 
to be slightly biased upward (pessimistic), as only ng — 1 oi the elements of the snapshot are used to 
compute each of the individual projection errors. The version of PRESS given by equations ( 56|57 l 
is called "leave-one-out" (LOOCV). Popular versions of the cross-validation, called n-folds, divide the 
realisation space O* into n subsets (0^)„gii „! of roughly identical cardinality. The cross-validation 
estimate J'^ is then obtained as described previously, but removing a whole subset of elements of the 
snapshot to fit the successive empirical POD basis (^^^(9^)) _„, , used to calculate the elements 



of the sum over the realisations in (56). 



leli, 



In our numerical experiments. We will use the 10-fold CV when ng > 20, and LOOCV otherwise. 
For a justification of this classical choice, the interested reader can refer to |M] . 
We define the error estimate provided by the cross-validation 



Ji^^. 



(58) 
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3.4 Locally uncorrelated solutions in the case of stochastic fracture 



bO 

o 



Ph 




10g(^^)^s^ 



n0 increases 



Order of the POD transform n^ 

Figure 7: Statistical estimates of the error of projection of the empirical POD as a function of the order 
of the decomposition. The error curves in dashed line are evaluated directly on the samples used to 
obtain the POD model. The experiment is reproduced for different number of random realisations. The 
error curves in plain line are the convergence curve of the cross-validation estimate of the projection 
error as a function of the order of the decomposition, for snapshots of different cardinal. 

In the case of fracture mechanics, the predictive power of the empirical POD is too poor to be 
used in analysis or reduced order modelling. This is shown by the numerical results in figure [7j We 
show, in plain lines, the convergence of the cross-validated error estimate P" as a function of the order 
of truncation of the POD projector, n^. This is done for different sizes of sample space ng — 8, 
ng = 16 and ng = 32. We can see that for a given order of the transform, the error estimate decreases 
slowly with the number of sampled realisations. More interestingly, each of the convergence curve 
decreases slowly for n^ < 5, and then flattens. The level of accuracy obtained is not satisfactory for 
the construction of a predictive model, where an accuracy of typically less than 10"'^ is desired for 
each of the predicted realisations. Of course, P^ is a point estimate, and confidence intervals should 
be added to show the dependency of the results on the particular set of realisations used as snapshot. 
However, we only wish to show the overall behaviour of the POD transforms, and stating that similar 
trends have been observed with different sets of realisations of same cardinality is sufficient at this 
stage. 

Further understanding can be provided by plotting the biased error estimates v'^ for different 
number of sampled realisations ng. The corresponding convergence curves are plotted in dashed lines 
in figurqTJ We can see that the behaviour of one particular realisation (ng = 1) is captured with a high 
degree of accuracy h'^ « 10~^ with 5 or 6 spatial vectors. Therefore, the time response of the structure 
for a given distribution of the material properties is highly correlated, and can be approximated in a 
low-dimensional manifold of dimension 5. However, when performing the same experiment with two 
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such realisations, the size of the reduced space required to approximate the two time evolutions is 
doubled. This behaviour still shows when further increasing the number of random samples, and leads 
to the flattening curves observed for the objective estimate D^. Notice that when ng tends to infinity, 
one should indeed see that i''^ tends to i^* , providing that the bias of the cross-validation strategy is 
small enough. Physically speaking, the trends observed here suggest that each of the time evolutions 
of the random process requires its own reduced space, which is relatively independent on the space 
generated for the other ones. The stochastic fracture process is therefore uncorrelated. This is not 
surprising, looking at the various crack paths represented in figure [6] A subspace representing one of 
the time evolutions of the damage will not necessarily help predicting other crack paths. 

Local contributions to the POD functional 



2.3x10-= 




Local contributions to the cross-validated POD functional 



2.7x10-3 




Figure 8: Local contributions to the error estimate of the empirical POD 



However, we can show that this issue comes from very localised uncorrelated data in space, in 
the region were the different cracks propagate. To show evidence of this fact, first notice that the 



functional that is minimised by the empirical POD transform ( 50 ) can be written as a sum of local 



contributions in space. Let us define the normalised difference between a time instance of an arbitrary 
realisation of the random process and its projection in the subspace generated by the empirical POD 
as: 

\f{t,e)ev, e{t,e)= „^,,/^,ii {ii{t,e)-tf{t,9)) . (59) 



Then, equation ( 50 1 can be written as 



\mt,o)\\2 
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(60) 
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Defining vector the vector of local contributions to the empirical POD functional by Q G M"" by its 

/ 1 1 \ '^' 

components Qi = 2. ei(ij^)^ : where i G |l,nu] we obtain 



na rit 

it.e)€V= 



j'i^i^imi (61) 



The same vector of local space contributions can be defined for the cross-validation estimate of the 
POD functional J*. Both vectors are represented in figure [s] The deformation of the mesh show the 
amplitude of the components of Q associated with each node of the lattice structure in the x and y 
directions, while the color scale represents the local contributions per node (square root of the sum 
of the square of the local contributions along x and y). 16 random realisations have been used as a 
snapshot in the present case. The snapshot POD is truncated at order 5. It clearly appears that the 
major contributors to the empirical POD and empirical cross-validated POD functionals are localised 
in a narrow region; the process zone. 

3.5 Weighted POD to improve the convergence of POD transforms 

The weighted POD transforms (or more precisely POD transforms with a weighted inner product) are 
classically used to improve the convergence of the POD transform. A typical weighted version of the 
normalised POD defined previously minimises the following weighted distance: 



d^(u,u) = ;^^5: -^;^iiu(M)-u(M)iiL (62) 



-* ,^J|U(M)I 

where V(f , 0) E V, JJ_ ^ ^Q,{t,d), and the weighted space inner product is defined by ||X||l = 

wX LX, with L a diagonal, definite positive matrix. Adjusting space weights L may permit to 
retrieve a fast convergence in space when locally uncorrelated data prevents it, which is of particular 
interest to us. Similarly, the scalar weight function w defined over V might allow to enhance the 
convergence properties of the POD when locally uncorrelated time instances or random realisations 
prevents the construction of a low-dimensional attractive subspace. 

A weighted version of the empirical POD can also be defined by minimising functional 









In addition to the previous remarks on the potential usage of these weights, to may allow to reduce 
the effect of outliers (extreme values of the random process) from the analysis, by an approximation of 
their respective probability, therefore allowing the empirical POD to converge faster towards the POD 
in terms of size of sample space. Although this particular feature is also of interest to us, we focus in 
this work on space weights L. 

Intuitively, we would like to set the entries of L that correspond to the process zone to very small 
values. How small these weights should be is obviously a difficult question to address. Too small 
weights would lead to numerical ill-posedness of the POD minimisation problem, while too large ones 
would pollute the results in the zones where the random process is correlated. We propose to extend 
the idea of weighted POD transforms by performing a POD decomposition of the data corresponding 
to only a subset of the spatial degrees of freedom. This set of degrees of freedom, called spatial domain 
of validity of the POD transform, are the degrees of freedom for which the local contributions to the 
POD metric (Qi)jg|ri „ ii is small enough (broadly, the space components for which the POD "works"). 
In turn, obtaining these local contributions requires the knowledge of an initial POD transform. Hence, 
we arrive at the idea of an iterative process to find an optimal POD-type decomposition associated 
with a domain of validity, which is in substance what is proposed in the following section. 
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4 Progressive Restricted POD 

4.1 Limit case of Weighted POD: the Restricted POD 

The proposed restricted POD performs a classical POD of the spatial components of the snapshot sam- 
ples corresponding to a high level of correlation (i.e. away from the process zone). These components 
will be denoted by superscript ^'^K This transform being computed, a POD-type decomposition of the 
remaining components (corresponding to the damage zone), denoted by ^f' can obtained by solving a 
complementary minimisation problem (i.e.: a regularisation of the weighted POD problem). 

Formally, let us decompose the euelidean space M"" as a product of two spaces: M"" — M"*^ x M"J' , 
with Uu = rir + Uf . Any vector X G M"" will be decomposed as follows 

X = (x(''',X^-^'') e M"" X M"^ . (64) 

We introduce the boolean restriction operators E*" e {0, !]."'■><"" and E-* e {0, !]."/><"" as 



X(^') = E^X 



(65) 



Only one entry per line of rectangular operators E*^ and E^ is equal to one; the remaining entries are 
null. Hence, X G M"" can be written as a sum of two contributions 

X = g'-^X^'') + W^X!-^'^ . (66) 

Hence, P*^ = E^ E'^ is a diagonal orthogonal projector which, when applied to arbitrary vector X, 
sets the entries of X corresponding to the process zone to 0, while the entries of X that correspond to 
the domain of validity of the POD are unchanged. A similar remark leads to the definition of projector 
£/ = g/^g/ and we have £■'' + E'^ = I^. 

We directly describe the restricted POD in its empirical version, the deterministic counterpart being 
obtained by simply replacing the sum over the samples by an expectation in the distance expression 
given below. We therefore look for a decomposition of the sample realisation of the random process in 
the form 

U' : V ^ M"" 



where af{t, 0) is a vector of weight functions depending on time and realisation index 9 of the random 
process. Our goal is to find a basis $*, and a restriction operator E*^, which minimise a certain cost 
functional. To arrive at the formal expression of such a problem, let us first assume that an optimal 
splitting of M"" is known. The restriction of the transform to the '-''•' degrees of freedom U* is then 
required to minimise a distance with respect to U'^''^'"^ given by 

The optimisation is constrained by the orthonormality condition 

This transform is a classical empirical (normalised) POD in M"*^. Its solution is as follows: 

• (0'''''')»elin.l are the eigenvectors of H'' = — — V ^ E''U(i, 6*) U(i, 6')^E''^ 

associated to its largest n^ eigenvalues (Ai)ig|i „^|. 
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• V(i, 6*) e P^ a%t, 9) = £^^M''^M'^ U(t, 6) (the weights are not defined by the empirical transform 
for 9 ^ 0"). a" is completely determined, for any 6* e 8*, by the minimisation problem on the 
'^'"^ degrees of freedom. Hence we will use hereafter the notation a*''' — a'*. 

We will from now on use the notation v'^''^ — y cf*'''(U^'''', U"' ^ ) for the error estimate correspond- 
ing to this transform. As for usual POD transforms, this estimate is the square root of the sum of the 
ordered eigenvalues of H'^ with indices larger than the order of truncation n^. 

~ s if) 

We now discuss how to obtain the complementary part of the decomposition U ' . This is not 
a fundamental ingredient of the statistical extraction, as the previous decomposition associated to a 
given partitioning of the degrees of freedom can be performed in a "stand-alone" manner. However, a 
prolongation of the reduced basis to the process zone can be useful when the aim is to define a global 
preconditioner, as in our previous investigations on the subject for instance [IHllllI- If necessary, the 
complementary part of the decomposition may be obtained by minimisation of the following functional: 

j^,/($«.(/)) ^ 11 J2 ,..] \\ll^fHt,9)-^^^^f^a^^^it,9)r2. (70) 

The previous problem can be interpreted as follows: we look for a prolongation U*--'-' of the basis 
U^'^^ such that if the projector is fitted on the reduced degrees of freedom, the global projection error 
(comprising the process zone) is minimal. The solution to this problem is given by: 

£-,(/) ^£^-1^ (71) 

where£= (g-^ J2 U(f,6') a''''^(i,6l)^ and 7 = ^ a''''{t,9) a''''{t,9f 
\ (t,e)e-p= / ^ (t,e)<£V= 

Solving the complementary minimisation problem can be seen as a regularisation of a classical 
weighted POD, when the space weights L are 1 for the '^''^ degrees of freedom, and for the '^^'^ 
remaining ones. An error estimate can be defined for this part of the solution, but is not used in the 
following, as by construction, the ^f' degrees of freedom carry most of the error, and the corresponding 
reduced basis cannot be used solely as a predictor. 

We are now able to express the problem of the restricted POD in full. We look for a transform of 



type (68) which minimises the associated error estimate v^'^, under the constraint that the extractor 
E*^ is of given rank n^ (in other words, the size of the process zone, characterised by n/ = ri„ — n^, is 
fixed in advance). In the following section, we propose a greedy algorithm, which provides a suboptimal 
solution to this problem. The choice of the best rir for our practical application is then obtained by a 
post-treatment of a range of solutions of the restricted POD corresponding to various n^. 

4.2 Computation of Restricted POD and associated validity domain with 
a greedy algorithm 

4.2.1 General algorithm 

The Greedy algorithm proposed in this section can be viewed as an element of a general class of 
iterative algorithms in two steps. We first solve the POD problem, given the functional to minimise 
(i.e.: the weights the restriction in our case). We then update the parameters of the functional such 
that the decrease in the error in the POD model is as large as possible. This second step is performed 
using the projector given by the first step, and under a set of constraints defining how these parameters 
can be updated at each iteration. The iterative algorithm is stopped when a certain target, or a certain 
minimum, is achieved. 

Let us now apply this general concept to the restricted POD described in the previous section. We 
introduce the functional >^''(£", W) = #^''(U(''\u''^''^), where U"'^''^ = g^'"-^ g''"- ■U<^'-\ and for a 
given order n^ of the decomposition. The greedy optimum ($^,E'^) is obtained in the following way: 
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0. Initialisation. Initialise the algorithm with E"" = I , where I is the identity operator in IR""^"-"^ 
and set i = 1 

1. POD. Find a reduced basis ^"''^ that minimises J^''"(#''.'^, W. ), and compute the prolonga- 
tion ^''.'•' by minimising the complementary cost function J^'^{^^:\ E „,) — •^'''•'(^'' ), where 
j«J($^'/) is evaluated for fixed extractor E-* 

2. Stopping criterion. If the stopping criterion is reached stop the greedy algorithm and return 
suboptimal solution (^'*,E''._ j. 

3. Update of the POD functional. Compute an update AP'! = P*^ — P^. of projector P'', such 

that the rank of E*" is reduced and J*'''(j^*., E'!) is minimum under some constraints on the 
admissibility of the parameter update. 

4. i ^ i + 1. 

Step 2 has been defined in the previous subsection. Our choice for the initialisation of the algorithm 
is to perform a classical POD on the whole spatial domain. We now need to clarify step 3 and 4 of 
the algorithm. 

4.2.2 Update of the extractors (Step 4 of the Greedy algorithm) 

At iteration i of the greedy algorithm, we first look here for an optimal update AP '^ of rank 1 of the 
POD functional. In other words, we set one non-zero entry of the {i — lY^ iterate of projector P*" to 
0. This entry, indexed j, corresponds to the maximum spatial contribution of the POD error, which 
formally reads 

j^argmax ^ e'(t,e) , (72) 

where V(i, 9) e V, £j{t, 9) is the j"^ component of residual vector: 

^(^'^) = 1-;^M^ fe-Xi^i(j.-£: fe)' {MuYMU)mt,9) (73) 

We have here made use of the work explained previously, which allowed us to rewrite the POD error 
as a sum of local contributions over space. 

Notice that in order to find index j of the space degree of freedom to be added to the process zone, 
the order of the POD (the rank of ^), needs to be defined. In our implementation of the strategy, we 
perform separate greedy computations for n^ ranging from 1 to a value specified by the user, and try 
to extract a good (or optimal in some sense) value of n^ in a post-treatment phase. 

Looking for a rank-one update of the projector P*^ can be computationally expensive if the process 
zone contains a large number of degrees of freedom. Therefore, in a second stage of the update, we set 
to zero all the entries of P'^ that correspond to nodes of the lattice structure which are located in a 
sphere on radius p centred on the node carrying entry j. This particular feature also permit to obtain 
a compact modification of the process zone at each step of the greedy algorithm. 

4.2.3 Stopping criterion 

We simply stop the algorithm when the rank of the i^^ iterate of projector P*^ is smaller than a value 
specified by the user. As for the choice of the best truncation order of the POD, the extraction of 
an "optimal" rank of the projector is done in a post-treatment phase which shall be described later on. 
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Error as a function of the order of the empirical POD 
without domain decomposition 




Error as a function of the size of the 
process zone for a given order of the empirical POD 

Figure 9: Estimate of the prediction error of the restricted POD model as a function of the number of 
modes the number of spatial degrees of freedom over which the spectral decomposition is not performed 
(i.e. the "size" of the process zone) 

4.2.4 Illustration 

Let us illustrate the ideas presented in this section by analysing the results given in figure [9] This 
figure shows the value of the empirical error estimate i/"''" as a function of both the order n^ of the 
decomposition and the rank of P-* . For each grid point of the surface represented in figure 9 a greedy 
computation as been performed to obtain both the POD projection and the associated aomain of 
validity. We used a snapshot containing ng = 16 realisations of the random process of fracture. The 
rank of P is described in a normalised form, 



Pf 



rank ( P-* 



(74) 



As the mesh of the lattice structure is regular, P^ approximates the ratio between the surface area of 
the process zone and the total surface of the structure. The curve corresponding to P-^ = is the one 
displayed in figure^ for ng — 16 (empirical POD over all the unknowns). 

Two general tendencies appear in this graph. The error estimate v'^'^ decreases with increasing 
relative size of the process zones P^ , and with the order of the POD n^. Notice that for a fixed 
size of the process zone, the error does not necessarily decrease monotonically. Indeed, the domains of 
validity of the associated solutions, although of fixed size, might differ. Therefore, one cannot apply the 
classical results on POD transforms, as two successive solutions do not necessarily work on the same 
data. However, due to the definition of the greedy algorithm given previously, the error does decrease 
monotonically with an increasing size of the process zone, for a fixed order of the POD approximation. 

In more details, one can see that providing that the order of the decomposition is sufficiently large 
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to allow for the extraction of a good spatial basis for the random process, the error first decreases 
quickly when increasing P^ (between around and 10 % in this example), and then continues to 
decrease but at a lower rate. Conversely, providing that we remove a sufficiently large part of the 
uncorrelated spatial degrees of freedom to perform the POD, the error decreases quickly with the 
order of truncation, and then tends to stagnate (around n^ = 5 in this example). This indicates that: 

• providing that the process zone is excluded from the correlation analysis, the spatial dimen- 
sionality of the attractive subspace of the problem of stochastic fracture is low and relatively 
well-defined. It also leads to an average prediction error which is acceptable for an engineering 
application (around 10"'^). 

• The effect of the random crack paths is relatively localised. A small number of POD modes 
(smaller than the number of realisations of the process) permit to represent the random process 
with a high level of fidelity away from a relatively narrow region. 

• The definition of the process zone is not obvious. The error decreases continuously with P-* . 

We took some precautions when stating the previous conclusions, as the analysis is in fact restricted 
to the particular sample which has been used to construct the reduced spaces and associated domain 
of validity. In fact, i/"''" is an optimistic measure of the predictive behaviour of a model constructed 
by the empirical restricted POD (it is biased downwards). Therefore, a re-sampling strategy needs to 
be defined in order to produce an objective confirmation of these observations. 

4.3 Cross-validation of the restricted POD 



»^</, 




P^ (%) 
Figure 10: Cross-validation estimate of the projection error of the restricted POD 
The cross-validation estimate P"^'' of the restricted POD is an almost unbiased estimate for the 
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expectation of the prediction error made when using the empirical restricted POD model 
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An important feature needs to be highlighted here. Let us recall that the cross validation simulates 
the prediction of independent realisations by the empirical model. In order to achieve this, we ignore, 
in turn, each of the observations of the snapshot from the set used to fit the POD model. We fit 
the model on the amputated snapshot, and measure the error made when predicting the realisation 
which has been ignored. All the individual errors are then summed up to obtain the cross-validation 
estimate. In the case of the restricted POD, building the model means extracting an attractive spatial 
manifold of specified dimension, and an associated domain of validity. Therefore, in order to obtain an 
objective measure of the predictive ability of the empirical restricted model, we need take into account 
this remark in both the POD step and the update step of the greedy algorithm. More precisely. 
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Throughout the greedy process, the extractor E'^^ is updated without using the realisation of the 
random process corresponding to random distribution 9 of the material properties. At iteration i, 
E*!" is therefore updated from its previous iterate E'!" by deleting line j of E*!" (and the lines 
corresponding to nodes located in a sphere centred on the node carrying space degree of freedom j), 
where 
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where e ■ is the j component of residual vector: 
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The cross-validation estimate v"'^ is relatively tedious to implement, and the technical aspects will not 
be detailed any further. Just notice that its computation requires, by induction, to perform the greedy 
algorithm separately for each of the realisations that are successively removed from the training set of 
samples, and only then sum the contributions of the successively ignored data to the error estimate. 



In figure 10 we plot the cross-validated estimate v^ obtained when using the cross-validated greedy 
algorithm to construct a restricted POD model. Notice first that the trends described for the evolution 
of J^*''' as a function of P^ and n^ are still valid for i/''. However, the flattening effect as the order of 
the decomposition increases is more acute, as the cross- validated estimate does not overfit the samples 
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when 710 becomes large. More importantly, when the size of the process zone is relatively small, and the 
order of the decomposition gets large, the cross-validation estimate of the projection error increases. 
This effect is well-known to statisticians. The cross-validation error estimate is indeed an (almost) 
unbiased estimate for V^'^ . However, the process zone and complementary reduced space are both 
identified on the same samples. These two quantities tends to be arranged, through the minimisation 
process, in a way that minimises the projection error over these particular samples. Hence, an other 
kind of over-fitting behaviour is obtained, which usually results in a low convergence of the solution 
with the number of realisations. 



4.4 Double Cross-validation of the Restricted POD 

To avoid the inconvenience mentioned in the previous subsection, the usual remedy is to perform 
a double cross-validation [l^. The double cross-validation estimate proposes to couple the cross- 
validation and simple holdout methods. The sample space O'' is split into two complementary groups 
O^ and O^ (of roughly equal cardinality in our implementation). In the first step of the cross- 
validation greedy algorithm, the projector is identified using only the realisations corresponding to 
sample subspace 01. Similarly, in the step of update of the process zone, the projection error used to 
find a maximum local contribution is only computed using the realisations corresponding to random 
distributions in 8 p. 




10 15 20 25 30 35 40 45 



Pf (%) 

Figure 11: Double cross-validation estimate of the projection error of the restricted POD. The dashed 
line is the number of modes that are selected for different size of the process zone. The process zones 
obtained at the circled points are the one represented in figure [12] 



The result of this procedure is given in figure 11 The snapshot comprising ng = 16 time realisations 
of the random process has been randomly split into two groups 01 and 0|;, each composed of 8 distinct 
elements. One can see that for small sizes of the process zone, the level of error achieved with the 
double cross-validation technique is much lower than with the "single" cross-validation. 

We now are in possession of a very useful tool for the analysis of random process with locally 
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uncorrelated data, given a set of representative solution samples. These results need now be post- 
treated to select an appropriate extractor of the space degrees of freedom corresponding to correlated 
data, together with a projector of rank n^ onto a representative reduced space. Further analysis would 
be required to assess whether a sufficiently large number of samples have been pre-computed in order 
to: 

• estimate the statistical error in the cross validation estimate 

• estimate the dependency of the random POD model on the snapshot. In other words, can further 
(or other) samples increase the predictivity of the POD model. 

We give in the following some indications on how to perform these tasks. 

5 Analysis and results 

5.1 Problem dimensionality away from the cracks 

Given the error map[TT] one can now decide, for a given size of the process zone, of the dimensionality 
of the problem corresponding to the reduced degrees of freedom. The general problem of identifying 
the meaningful components in an empirical POD model is a difficult one. It has been addressed in 
numerous publications (see the reviews and test studies proposed in [3Sll3Sj). Broadly speaking, |33] 
seems to indicate that all proposed indicators (eg.: elbow tests, tests based on the stability of the 
eigenvalues of the POD operator like the broken stick or techniques based on bootstrap confidence 
intervals, and tests based on the rate of predictivity of the model as a function of the number of 
modes) perform well for a specific range of problems. The author recommends to use several of them 
in order to avoid relying on an indicator which would be out of its domain of application. We follow 
this idea. However, our problem differs from that of a classical proper orthogonal decomposition. For a 
given size P^ of the exclusion zone, the models obtained for different order of the POD approximation 
are fitted on different sets of data. The corresponding greedy iterates of extractor E*^ might differ from 
one order to the next, despite the fact that their ranks are equal. Therefore, the use of tests based on 
the stability of the eigenvalues of the POD operator is hardly justified. However, we found that tests 
based on the decrease rate of the predictivity of the model perform well is this context. 

The two indicators used in this work are reviewed in (42| . The first one identifies the order at which 
the overfitting behaviour of the empirical POD becomes too large compared to the decrease rate of the 
cross-validation estimate of the projection error. The number of modes selected using this criterion is 
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In the above formula, pff is the (double) cross-validation estimate for a restricted POD model of 
order n^, while i^f^ ,i is the value of v'^''' (biased estimate of the projection error) for a model of order 
fi^ + l. It is usually argued that this criterion tends to underestimate the number of modes that should 
be retained. Indeed, an overfitting behaviour does not necessarily imply that the corresponding modes 
cannot yield a significant decrease in the projection error. The second criterion used in this work 
measures the rate of predictivity of the POD model as a function of the number of modes. This rate 
is compared to the rate at which statistical degrees of freedom are fixed by the tensor approximation 
of the samples (for more details, see [46j). The number of modes selected using this criterion is: 
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In the last equation, Ai^ — — (the expression of this quantity has been 

(n„ - n^) + [no - n^) - 1 

adapted to the version of the POD at use in this paper, which works with samples that have not been 

centred prior to the spectral analysis). This criterion is usually found to be more conservative than 

the previous one. 

Using an average of both these cross-validation based criteria performs well in our example. For a 

given size of the process zone, we set n^ to the number of modes that corresponds to the average of 

the cross-validation estimates obtained with a restricted POD of order nj and nY . The result of our 

indicator is plotted in figurc fTT] Its validity will be commented using qualitative arguments later on. 

5.2 Spatial exclusion zones 

Once the dimensionality of the random process is determined, depending on the size of the excluded 
spatial domain, one can extract one particular POD model and its associated process zone. We 



recall the observation that in figure 11 the cross-validation estimate of the projection error decreases 
continuously with the rank of the process zone extractor. There is indeed a narrow region where the 
rate of this trend is higher than anywhere else (between P*" = and P*" — 10%), which corresponds 
to all random cracks being excluded from the domain of validity of the restricted POD. However, the 
effect of the cracks is "seen" by the structure even far from this narrow region. 

Therefore, process zones need to be defined for a given accuracy of the POD model. The choice 
of one particular model among those represented with the dashed line in figure [TT] has to be done 
according to practical considerations: the selection is a trade off between accuracy and numerical costs 
for the "offline" phase of the reduced order modelling technique (i.e.: the phase where the model 
will actually be used to reduce CPU time in further computations). A large excluded domain would 
mean a large part of the spatial degrees of freedom for which no reduced space is available (or only 
the regularisation proposed in section |4.1[ whose associated projection error is potentially large and 
not controlled in any case). Conversely, a small excluded domain yields inaccurate predictions in the 
reduced space built for the remaining degrees of freedom. If one knows what the minimum required 
accuracy is, one can select the model that gives this value, for a maximum size of the domain of validity. 
Such a model, associated to an error of 10"'^, is shown in figure 11 However, in the general case, one 



would have to specify the way the reduced model would be used, and define and objective function 
that depends on both the speed-up provided by the reduction and the accuracy. Once this objective 



function is defined, its minimisation over the admissible models (dash line in figure 11) would lead 
to the selection of the desired domain of validity. How to use the model will be elaborated on in the 
discussion section. 



Figure 12: Process zones obtained by using the restricted POD. Increasing the size of the process zone 
decreases the error of projection of the POD model that is fitted on the remaining spatial degrees of 
freedom. 

For purposes of illustration, we show different process zones, associated with increasing values of 



P"^ in figure 12 (POD models at circled points of the map in figure 11 ). The basis vectors obtained by 
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the Restricted POD are also depicted in figure [T3| for P^ corresponding to a cross-validation estimate 
of the projection error equal to 10^"^. On can clearly see that the 6 first modes are associated with 
global deformation of the domain of validity of the reduced model. The 7*^ (not retained) can be 
qualified as "noise" engendered by the proximity of the damaged region. This gives us a qualitative 
confidence in the criterion used to find dimensionality of the problem. 
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Figure 13: Modes obtained with a restricted POD performed on 16 realisations of the random process 
(right column), truncated at order 6 (the last vector displayed in this picture is the first mode which 
is not selected by the criterion used in this paper), and for a size of the process that leads to a 
projection error of 10~^ over the complementary spatial domain. In the left column, the POD modes 
are represented with the regularisation proposed for the spatial components that belong to the process 
zone. 



6 Potential applications and improvements 

6.1 Reduced order modelling and POD-based preconditioners 

From the previous section, it clearly appears that the tools proposed in this paper will have to be 
adapted to each specific application. In particular, the effort made in the "offiine" and "online" phases 
of the selected strategy will have to be balanced in terms of error estimation. We discuss here two 
different types of potential applications. 

• The restricted-POD as a preconditioner. Reduced-spaces obtained by the POD are a good 
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alternative, or complement, to Krylov-based subspaces for the deflation of iterative solvers, and 
in particular when the problem is characterised by large changes in the tangent operators [501 US] • 
Using such a reduced space as a preconditioner is not particularly difRcult. As the reduced 
space is merely used to accelerate linear solvers, separating "offline" and "online" phases in not 
necessary. The restricted POD model can be computed, or updated using cheaper algorithms, 
while considering as samples all the previously computed solutions. 

• The restricted-POD to construct a reduced order model. We look here for a cheap approximation 
of the (nonlinear) Schur complement of the reducible spatial domain onto the process zone, as 
proposed in various papers [231 HSl HI] • In the case, the proposed tool identifies a reduced space 
in which the Schur complement of the problem with randomness can be projected. However, 
proper error estimates need to be established. One has here the choice between computing a large 
number of samples, and using statistical tools to obtain confidence intervals for the projection 
error, or developing a priori or a posteriori error estimate that are reliable and cheap enough to 
be applied "online" phase (some of the work related to error estimate of nonlinear reduced order 
models can be found in [Ul [T71 HI] ) ■ The former approach is preferable if interactivity is the goal 
of the reduced model. However, this is not clear if the purpose is to obtain some statistics of the 
problem in a minimum amount of time. One then has to balance the effort made to increase the 
predictivity and confidence of the model in the "offline" phase, and the cost of "online" error 
estimates and updating procedures. This is a difficult issue, notably as both statistical error 
estimate for the POD, and error estimates for general nonlinear reduced order model, are not 
yet available, which is further discussed later on. 

6.2 Statistical extraction and reduced order modelling for general struc- 
tural heterogeneities 

Let us recall that in the example studied throughout this paper, the elastic properties of the composite 
material have been assumed constant. Indeed, non-homogeneous elastic properties imply a global lack 
of correlation. The deformation of the structure within each phase cannot be represented accurately 
within a low-dimensional subspace. Therefore, a fine-scale correction of a global POD basis would 
be necessary for each realisation. We do not believe that this is of significant difficulty for localised 
nonlinearities (for instance in quasi-brittle). One can use the restricted POD proposed in this paper 
to obtain an empirical model, based on random samples. In the "online phase" , one can then add to 
the global basis the residual of the predictions performed with the empirical model, and this for a few 
of the time steps of the analysis. However, for global nonlinearities (e.g.: large deformations, ductile 
fracture) , this framework might be jeopardised as finding a subspace for the residual of the empirical 
POD model might require to solve all realisations exactly. One possible avenue for the a priori reduced 
modelling of general heterogeneous structure might be the use of mappings from realisations of the 
random heterogeneities to a reference configuration. 

An other issue to consider in the context of reduced order model of nonlinear problems is that 
significant speed-up cannot be achieved within a Galerkin framework. One has to devise strategies 
such as Petrov-Galerkin techniques [HHISIITH], or trajectory interpolations ^35]. In the former case, 
the nonlinear behaviour is somehow extrapolated from a point of the domain to its surrounding, hence 
reducing the costs of numerical integration requires to evaluate the residual of nonlinear problems. 
Such a methodology, in the context of heterogeneous structures, is far from being obviously applicable. 
In the second case, the local nonlinear behaviour is interpolated over sampled values of the weighting 
functions associated with the reduced basis vectors. In addition, our recent investigations show that 
the reducibility of a problem is not directly related to the dimensionality of the ensemble of possi- 
ble solutions, but to the dimensionality of its image by the nonlinear operator of consideration |28j . 
Therefore, this image should be identified and analysed, possibly with the tools proposed in this paper. 
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6.3 Evaluation of the sensitivity of the statistical model to the realisations 

As mentioned earlier, the selection of the process zone depends on an objective function which defines 
the trade-off between the cost and accuracy of the method. Once a threshold of accuracy is selected 
by the user, the algorithm computes a greedy optimum (^^,E'^), which minimizes the cost function. 
Since the optimum depends on the random set of samples 0* and the selection of the complementary 
sets 81 and 9|;, it is desirable to determine the sensitivity of the optimum to these realisations. 

This could be accomplished by the method of "Repeated Double Cross Validation" introduced 
in |50j . The idea of this method is to repeat the double cross validation nuep times with different 
splittings of the set 9" in 81 and 8|;. This yields nj^ep greedy-optimal pairs (^'',E'^), from which the 
variability of the outputs can be estimated. The number of repetitions nj^^p depends on the complexity 
of the model studied, values between 10 and 100 have been suggested in the literature [5U1I51) . 

In order to analyse the convergence of this method, we need to define a distance function which 
induces a topology on the space of outputs considered. A possibility for this metric is the HausdorjJ 
distance, which gives the distance between two subsets of M" . First one defines the directed Hausdorff 
distance from set A to set B by: 

h(A, B) := max I min d(a, b) 

where d{a, b) is the Euclidean metric between two points a and b in M". We note that this distance is, 
in general, not symmetric, that is h{A, B) is not equal to h{B, A). The undirected Hausdorff distance 
between the sets A and B is defined by: 

H{A, B) ■- max {h{A, B),h{B, A)) . 

The Hausdorff distance is commonly used in object matching and pattern recognition [52] . Here it can 
be used to measure the variance between the process zones obtained by different realisations of the 
model. 

We are particularly interested in "confidence zones" for the support of the restriction operator E*". 
These could be easily extracted from the frequency with which each lattice node is included in the 
process zone or its complement. For example, if nuep — 10 and a particular node is included in the 
process zone at least 9 times, then we can include it in a 90% confidence area for the process zone (or 
10% confidence for the complement). 

This method has the advantage that the samples used for calculating the projector and not used 
for the projection error (i.e. the training data and the testing data) are kept separate. This eliminates 
too optimistic error estimates due to over-fitting and allows us to study the variability of the predicted 
process zone. If the confidence zones obtained show too much variability, this probably indicates that 
the number of samples ng is too small and that a larger number of samples is needed. 

It is important to realise that these re-sampling techniques always rely on the available realisations 
to evaluate the variability in the statistics of interest. Therefore, a sufficiently large number of reali- 
sations of the random process is required, the term "sufficiently" being extremely difficult to quantify 
as it requires to evaluate the propagation of variabilities through a highly nonlinear structural model. 

6.4 Advanced statistical surrogates and extension to parametric/stochastic 
problems 

From a conceptual point of view, the method described in this paper is a tailored weighted POD, with 
an adaptive choice of weights assumed binary. It would however be of interest to consider a transition 
zone between the process zone, where the lack of correlation is such that no representative reduced 
space can be identified, and the zone where the data are assumed reducible. Weighting this region will 
permit to take into account the variability of the solution in this transition zone, and its identification 
would permit to give identifications about where the region where "online" error estimation should be 
performed. 
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An other important remark is that we did not make the best solution samples in this work in 
the sense that we did not segregate spatial correlation in time (within each realisation of the fracture 
problem) , and spatial correlation due to the variability. We can expect that smaller process zones and 
lower level of projection errors would be obtained by performing the Restricted POD at each time step 
of the analysis. The idea of performing spectral analysis using a separation of variables is essentially 
what proposes the proper generalised decomposition (see for instance |53l 154)). However, such an 
extension of the proposed work will require to first define what time is in a quasi-static context, so 
that random solutions at a given time step share physical similarities (e.g.: crack "length"). 

The method proposed in this paper is illustrated in the context of problems with randomness. The 
randomness that has been considered in our test case is driven by engineering observations, not from 
mathematical considerations, which makes the validation of the dependency of the POD model to the 
particular set of realisations at hand difficult. However, the application of the concept proposed in this 
paper is not limited to statistics, and could be easily extended to parametric and stochastic problems. 
In these two contexts, formal or practical error indicators for the POD surrogate would probably be 
easier to obtain. 

7 Conclusion 

We proposed a method based on the statistical analysis of solution samples for extracting the process 
zone of a problem of fracture in random heterogeneous media. We showed that the definition of the 
process zone is not unique, but is in fact parametrised by the level of non-correlation of the random 
process which is obtained in the complementary part of the domain. The method generates a reduced 
space for this complementary part, in which the solution to the problem of fracture in random composite 
is optimally approximated. We showed the potential of this novel strategy as a tool for the analysis 
of localised physical phenomena and as a means to speed up parametric problems via preconditioning 
or reduced order modelling. 
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